Validity Index and number of clusters
Authors
Abstract
Clustering (or cluster analysis) has been used widely in pattern recognition, image processing, and data analysis. It aims to organize a collection of data items into c clusters, such that items within a cluster are more similar to each other than they are to items in other clusters. The number of clusters c is the most important parameter, in the sense that the remaining parameters have less influence on the resulting partition. Several methods, called validity indexes, have been proposed to determine the best number of clusters. This paper presents a new validity index for fuzzy clustering called the Modified Partition Coefficient And Exponential Separation (MPCAES) index. The efficiency of the proposed MPCAES index is compared with that of several popular validity indexes. Further insight into these indexes is gained from a series of numerical comparisons and from the real Iris data set.
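As a rough illustration of what a fuzzy validity index computes, the sketch below implements two well-known building blocks: the classical partition coefficient, PC = (1/n) Σ_i Σ_k u_ik², and its normalized variant, the modified partition coefficient, MPC = 1 − c/(c − 1) · (1 − PC). This is not the MPCAES index itself, whose formula is not given in the abstract. The function names and the membership-matrix layout (c rows, n columns, each column summing to 1) are assumptions made for this example.

```python
# Minimal sketch of two standard fuzzy validity measures.
# Assumption: U is a list of c rows (clusters), each holding n membership
# values (one per data item); memberships of each item sum to 1.

def partition_coefficient(U):
    """Classical partition coefficient: mean of squared memberships per item."""
    n = len(U[0])
    return sum(u * u for row in U for u in row) / n

def modified_partition_coefficient(U):
    """Normalized partition coefficient, rescaled to the range [0, 1]."""
    c = len(U)
    if c < 2:
        raise ValueError("at least two clusters are required")
    return 1.0 - (c / (c - 1)) * (1.0 - partition_coefficient(U))

# A crisp partition (every item fully in one cluster) scores highest.
U_crisp = [[1.0, 0.0, 1.0],
           [0.0, 1.0, 0.0]]

# A maximally fuzzy partition (equal membership everywhere) scores lowest.
U_uniform = [[1.0 / 3.0] * 4 for _ in range(3)]

pc_crisp = partition_coefficient(U_crisp)
mpc_crisp = modified_partition_coefficient(U_crisp)
pc_uniform = partition_coefficient(U_uniform)
mpc_uniform = modified_partition_coefficient(U_uniform)
```

In practice, an index of this kind is evaluated on the membership matrices produced for several candidate values of c, and the value of c with the best score is retained. The unnormalized PC ranges from 1/c to 1 and therefore tends to change with c regardless of structure, which is the motivation for the rescaled form.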
Similar resources
Presenting a new index for assessing clustering validity in type-2 fuzzy clustering algorithms
One of the main issues in fuzzy clustering is determining the number of clusters, which must be specified before clustering; choosing different values for the number of clusters leads to different results. The clusterings obtained with different numbers of clusters should therefore be validated with an index. So far, however, no such index has been introduced for interval type-2 fuzzy C...
A Novel Validity Index for Determination of the Optimal Number of Clusters
The structural characteristics of clusters are investigated in the partitioning process. Two partition functions, which show opposite properties around the optimal cluster number, are found and a new cluster validity index is presented based on the combination of these functions. Some properties of the index function are discussed and numerical examples are presented. key words: clustering, val...
Performance Evaluation of Some Clustering Algorithms and Validity Indices
In this article, we evaluate the performance of three clustering algorithms, hard K-Means, single linkage, and a simulated annealing (SA) based technique, in conjunction with four cluster validity indices, namely the Davies-Bouldin index, Dunn’s index, the Calinski-Harabasz index, and a recently developed index I. Based on a relation between the index I and Dunn’s index, a lower bound of the value...
The role of Tourism clusters on regional competitiveness (Case study: Religious tourism cluster of Qom)
As a novel idea in the discussion of the role of industrial development in regional development, the term “cluster” has attracted attention since the 1990s as a means of increasing competitiveness. Territorial development researchers believe that the formation of regional industrial clusters improves competitiveness and plays a role in promoting competitive advantages and regional development. Hence, because of the ...
Nonparametric Genetic Clustering: Comparison of Validity Indices
A variable string length genetic algorithm (GA) is used to develop a novel nonparametric clustering technique for the case where the number of clusters is not fixed a priori. Chromosomes in the same population may now have different lengths, since they encode different numbers of clusters. The crossover operator is redefined to handle variable string lengths. Cluster validity index is used as a...